Adding uint4 dtype implementation #13
Conversation
Summary: We have a lot of interest in int4 dtypes, and we'd like to add the dtype outside of PyTorch core. This PR adds some preliminary support for uint4 through a tensor subclass, and we'll continue to iterate on it. Test Plan: python test/dtypes/test_int4.py Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: fc5acbb Pull Request resolved: #13
torchao/dtypes/int4.py
Outdated
uint8_data = uint8_data.contiguous().view(-1)
return (uint8_data[::2] << 4 | uint8_data[1::2]).view(down_size(shape))

class UInt4Tensor(torch.Tensor):
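For readers following along, here is a plain-Python sketch of the packing trick used above: two 4-bit values share one byte, with the even-indexed element in the high nibble. The helper names mirror the PR's `pack_uint4`, but this standalone version is illustrative, not the PR's actual implementation.

```python
def pack_uint4(vals):
    # Pack pairs of 4-bit values (0..15) into single bytes, high nibble first.
    assert len(vals) % 2 == 0, "need an even number of nibbles"
    return bytes((a << 4) | b for a, b in zip(vals[::2], vals[1::2]))

def unpack_uint4(packed):
    # Recover the original nibble sequence from the packed bytes.
    out = []
    for byte in packed:
        out.append(byte >> 4)     # high nibble (even index)
        out.append(byte & 0x0F)   # low nibble (odd index)
    return out

vals = [1, 15, 0, 7]
packed = pack_uint4(vals)          # bytes([0x1F, 0x07])
assert unpack_uint4(packed) == vals
```

This is why the tensor-subclass version views the data as a flat uint8 buffer first: the packing only makes sense on a contiguous 1-D layout, and the logical shape is tracked separately via `down_size(shape)`.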
We're creating a whole new Tensor here, but really we want to extend a QTensor with a new lower precision backend? Or do we want to combine UInt4Tensor with a QTensor, where QTensor has a UInt4Tensor for storing int4 weights (next to scales etc.)?
by QTensor do you mean https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/quantized/QTensorImpl.h? I think we are moving away from that and just relying on PyTorch native dtypes now
@cpuhrsch I guess by QTensor you mean these tensors: https://github.com/pytorch-labs/ao/blob/main/torchao/quantization/subclass.py#L178. This UInt4Tensor can compose with that (it's similar to a uint8 tensor); see the example below (PerChannelSymmetricWeightUInt4Tensor)
The next major issue I think you'll hit is that with
oh, the suggestion from Alban is that we'll desugar it for now, since Triton doesn't have int4 support anyway and we are using handwritten custom kernels; the int4 is just for representation in the frontend
Are you planning to add the int4 dtype on the ATen C++ side?
No, we don't; we'll use bits8 instead and encode int4-related information in the ops that use these tensors, e.g.
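As a hedged illustration of that idea (hypothetical names, not the PR's actual API): the storage stays a plain byte buffer ("bits8"), and the int4 interpretation travels alongside it as explicit metadata that consuming ops check, rather than as a native storage dtype.

```python
def make_bits8_int4(packed_bytes, logical_numel):
    # Hypothetical wrapper: raw uint8 storage plus metadata recording that
    # the bytes should be interpreted as packed uint4 elements.
    return {
        "storage": packed_bytes,
        "logical_dtype": "uint4",
        "numel": logical_numel,
    }

def uint4_numel(t):
    # A consuming op reads the int4 interpretation from metadata,
    # not from the storage dtype (which is just bytes/uint8).
    assert t["logical_dtype"] == "uint4", "op expects packed uint4 data"
    assert len(t["storage"]) * 2 >= t["numel"]  # two nibbles per byte
    return t["numel"]

t = make_bits8_int4(bytes([0x1F, 0x07]), 4)
assert uint4_numel(t) == 4
```

The design trade-off: the frontend sees a uint4-like tensor, while every backend (eager, inductor, export) only ever has to handle uint8 storage plus metadata.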
Summary: These dtypes are added since we see more demand for sub-byte dtypes, especially with the popularity of LLMs (https://pytorch.org/blog/accelerating-generative-ai-2/#step-4-reducing-the-size-of-the-weights-even-more-with-int4-quantization-and-gptq-2021-toks). Note these are just placeholders; operator support for these dtypes will be implemented with tensor subclasses, e.g. torch.empty(..., dtype=torch.uint1) will return a tensor subclass of uint1 that supports different operations like bitwise ops, add, mul, etc. (to be added later). Also note that these are not quantized data types; we'll implement quantization logic with tensor subclasses backed by these dtypes as well, e.g. `Int4GroupedQuantization(torch.Tensor)` will be implemented with torch.uint4 tensors (see pytorch/ao#13 as an example). Test Plan: CIs; python test/test_quantization.py -k test_uint1_7_dtype Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: #117208 Approved by: https://github.com/ezyang
zero_point: int,
) -> torch.Tensor:
inv_scale = 1.0 / scale
return pack_uint4(torch.clamp(torch.round(input * inv_scale) + zero_point, 0, 15).to(torch.uint8))
for inductor and export, how do you want these routines to be optimized? Is there a custom kernel for quantizing/dequantizing? Or would you prefer for inductor to "see" this decomposition, so it can try to fuse it into a Triton kernel?
I think it will be the latter for now, but maybe in the future we can have custom kernels
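For context, the quantize routine in the diff above rounds to the nearest step, shifts by the zero point, and clamps to the uint4 range [0, 15]. A plain-Python scalar sketch of the same affine round trip (hypothetical helper names; the real code operates on tensors and then packs the result):

```python
def quantize_to_uint4(x, scale, zero_point):
    # round(x / scale) + zero_point, clamped to the uint4 range [0, 15]
    q = round(x / scale) + zero_point
    return max(0, min(15, q))

def dequantize_from_uint4(q, scale, zero_point):
    # Inverse affine map; lossy because of rounding and clamping.
    return (q - zero_point) * scale

scale, zero_point = 0.5, 8
q = quantize_to_uint4(1.3, scale, zero_point)      # round(2.6) + 8 = 11
x_hat = dequantize_from_uint4(q, scale, zero_point)  # (11 - 8) * 0.5 = 1.5
assert q == 11 and abs(x_hat - 1.3) <= scale / 2
```

Note the reconstruction error is bounded by half a quantization step (scale / 2) for values inside the representable range; values outside it saturate at 0 or 15.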
self, size = args
size = utils.infer_size(size, self.numel())
assert not kwargs
# WARNING: views not preserved
why aren't we preserving the view?
this is copied from Edward's original PR; I'm not exactly sure why. It feels like this is creating a new tensor, so we don't have a view relationship? Is view supposed to share storage?
lgtm
Stack from ghstack (oldest at bottom):
Summary:
This PR adds some preliminary support for uint4 through a tensor subclass, and we'll continue to iterate on it.
We plan to move the uint4 tensor subclass to core after it is more mature.
Test Plan:
python test/dtypes/test_int4.py
Reviewers:
Subscribers:
Tasks:
Tags: